Sports Analytics with R

Stat 291

Spring 2018


Professor Bradley A. Hartlaub
Office 305 Rutherford B. Hayes Hall
Phone PBX 5405
e-mail hartlaub@kenyon.edu

Office Hours

Supplemental Text

Baumer, Benjamin S., Kaplan, Daniel T., and Horton, Nicholas J. (2018) Modern Data Science with R, Chapman & Hall/CRC Texts in Statistical Science.

Goals

Title IX Responsibilities

As a member of the Kenyon College faculty, I am concerned about the well-being and development of our students, and am available to discuss any concerns. However, I want you to know that faculty members are legally obligated to share certain information with the Title IX coordinator. This is to ensure that the student's safety and welfare are being addressed, consistent with the requirements of the law. These disclosures include but are not limited to reports of sexual assault, relational/domestic violence, and stalking.

Statistical Package and Computing

The R statistical software package will be used throughout the course. Assignments and course announcements will be sent to you via e-mail or posted on the course web page. Data sets and Excel worksheets (csv files) will be placed in P:\Data\Math\Hartlaub\SportsAnalytics or a Google Drive folder. Proper maintenance of computer accounts, files, etc. is your responsibility. I recommend that you back up your data sets, worksheets, and R scripts on a regular basis. I will not assume you have prior experience with statistical software, so you do not need to be concerned about the use of technology in the classroom. R is free, and you may download and use it on your own personal machine.

Our class meets in a seminar room, so your participation in class discussions is essential for success in this course. R statistical software will be used extensively throughout the course, but you may not have much time to work on scripts and notebooks during regular class hours. Group assignments will be a regular part of your work.

Learning Disabilities and Math Anxieties

If you have a disability and feel that you may need some type of academic accommodation(s) in order to participate fully in this class, please feel free to discuss your concerns with me in private and also identify yourself to Erin Salva, Coordinator of Disability Services, at PBX 5453 or via e-mail at salvae@kenyon.edu.

Weekly Assignments: Position papers and activities

Weekly assignments will be given throughout the semester. Subsets of these assignments will be collected and graded. Peer assessment will also be used to provide feedback on position papers and other assignments.

The best way to improve your proficiency with R is to write programs. Comparing and contrasting different models and analyses will be a regular part of your work. Professional journal articles will be used rather than a traditional text. Your opinions on the views and claims of the authors must be defended with appropriate data and analysis. Our goal is to improve your technical communication skills, both oral and written.

Late Policy

Assignments must be turned in at the beginning of the class period on the assigned due date. No credit will be given for late papers. If for any reason you cannot turn in your paper on the assigned date, you must contact me before class. If you are unable to contact me, you can leave a voice mail or send an e-mail message to hartlaub@kenyon.edu.

Exams

There will be no exams in this course.

Midterm Paper

Your paper, based on a current journal article, will summarize a major issue in a professional sport. In addition to summarizing the article and the views of the authors, you will collect current data and fit appropriate models to make statistical inferences. Instead of collecting data, certain problems may be more appropriately addressed by simulating data. Your work with R will be used to support or refute the conclusions from the journal article. Do the conclusions still hold or are they changing over time? If the conclusions have changed over time, are the changes the result of rule changes or some other factor? More details regarding the paper will be provided as we approach the midpoint of the semester. Your proposal, including the article you want to use for this assignment, is due on or before February 23. Your paper is due on or before March 2 (the last day before break).

Quizzes

Short quizzes based on the reading or R concepts will be given sporadically throughout the semester.

Final Project

Each student will find a journal article and a supporting data set and apply an appropriate statistical analysis. The variables in the data set and the purpose of the study must be clearly defined. If the data is obtained from a periodical, the date of publication must be later than January 1, 2013. Summaries of your proposed analysis must be submitted on or before Monday, April 30. Final papers explaining the problem of interest, your analysis, and your conclusions must be submitted on or before Wednesday, May 9 at 8:30 am. A short presentation to the class, perhaps in the form of a poster session, will also be required.

Group Presentations

You will be responsible for preparing and delivering presentations to the class. Some of these presentations may be traditional presentations with PowerPoint (or other slides), and others will be in debate, poster, or podcast format. Our goal is to refine your ability to summarize your position on technical issues succinctly. You may work with one, two, or more of your peers on these group presentations.

Grades

Your course grade will be based on your overall percentage. The five categories below will be weighted equally to determine your overall percentage.

  • Weekly Assignments
  • Quizzes
  • Presentations and Class Participation
  • Midterm Paper
  • Final Project

Course Description

Sports analytics are being used more frequently to help managers and owners make important decisions. Billy Beane was one of the first general managers to apply statistical methods and models in MLB. Now, similar models and methods are being used in basketball, football, hockey, soccer, golf, swimming, and other sports. Using data science techniques to scrape data from appropriate sources has been a game changer for many analysts, who are always trying to gain an advantage over their competitors. We will carefully examine the statistical methods that are being used. In addition to analyzing individual and team performance over time, we will look at the impact of rule changes and new guidelines or draft policies. Students will read current journal articles from sports statistics journals and analyze data to address open questions of interest. Oral and written communication about these technical models will be a regular part of the course. Students will regularly be using R to analyze data and make inferences. Statistical methods for analyzing time series data will be a major part of this course. Prerequisite: MATH/STAT 106 or 116 and another statistics course or permission of instructor.